Abstract 

We are concerned with the problem of maximizing the worst-case lifetime of a data-gathering wireless sensor network 
consisting of a set of sensor nodes directly communicating with a base-station. We propose to solve this problem by modeling 
sensor node and base-station communication as the interactive communication between multiple correlated informants (sensor 
nodes) and a recipient (base-station). We provide practical and scalable interactive communication protocols for data gathering in 
sensor networks and demonstrate their efficiency compared to traditional approaches. 

In this paper, we first develop a formalism to address the problem of worst-case interactive communication between a set of 
multiple correlated informants and a recipient. We realize that there can be different objectives to achieve in such a communication 
scenario and compute the optimal number of messages and bits exchanged to realize these objectives. Then, we propose a formalism 
to adapt these results in the context of single-hop data-gathering sensor networks. Finally, based on this proposed formalism, we 
propose a clustering based communication protocol for large sensor networks and demonstrate its superiority over a traditional 
clustering protocol. 

I. Introduction 

Many future and extant sensor networks feature tiny sensor nodes with modest energy resources, processing power, and 
communication abilities. A key networking challenge is to devise protocols and architectures that can provide relatively long 
operational sensor network lifetimes, in spite of these limitations. We define network lifetime as the time until the first sensor 
node or the base-station runs out of energy. This reduces the network lifetime maximization problem to minimizing the 
maximum energy expenditure at sensor nodes and the base-station. Sensor nodes expend energy in sensing, computing, and 
communication. In this paper, we are mostly concerned with reducing the energy cost of communication. We neglect the energy 
consumed by the nodes in sensing and computing because sensing costs are independent of the communication strategy being 
deployed and computing costs are often negligible compared to communication costs. 

The energy expended by a sensor node or the base-station in communication has two components: reception energy and 
transmission energy. The energy consumed in reception depends on the number of bits received and the per bit energy cost 
required to keep the receiver circuitry energized. The transmission energy depends on a number of factors such as transmit 
power levels, receiver sensitivity, channel state (including path loss due to distance and fading) and the kind of channel coding 
employed. In this paper, we assume that the data rates are low and that optimal channel coding is employed. Both these 
assumptions allow us to assume that the transmit power is linearly proportional to the data rate. Therefore, the communication 
energy is minimized by transmitting and receiving as few bits as possible. 

In this paper, we first develop a theory of worst-case, lossless interactive communication between multiple correlated 
informants and a recipient. Then, assuming that the sensor data in a data-gathering sensor network is correlated, we model 
the communication between sensor nodes and the base-station in a single-hop data-gathering wireless sensor network as the 
m-message interactive communication between multiple correlated informants (sensor nodes) and a recipient (base-station), 
where at most m messages are exchanged between a sensor node and the base-station. Interactive communication helps the 
sensor nodes reduce their energy consumption by allowing them to use multiple compression rates while transmitting their 
information, thereby exploiting the correlation in sensor data, and by offering computationally inexpensive encoding schemes. Based 
on our work on "multiple correlated informants - single recipient" interactive communication, we then propose a formalism 
to estimate the optimal number of messages and bits exchanged, in the worst-case, between the base-station and the sensor 
nodes in a data-gathering network. Then, we apply this formalism to maximize the worst-case lifetime of the network, for 
different objectives of communication. We conclude by proposing a new clustering protocol for large sensor networks, based 
on interactive communication. 

To the best of our knowledge, our work for the first time addresses the problem of interactive communication between a 
recipient and a set of multiple correlated informants and then based on this formalism, proposes an alternative approach to 
enhance the lifetime of a data-gathering sensor network. 

II. Related Work 
The "multiple correlated informants - single recipient" communication problem that we consider in this paper is essentially the 
well-known distributed source coding (DSC) problem. This problem was first considered by Slepian and Wolf [1] for lossless 
compression of discrete random variables and by Wyner and Ziv [2] for lossy distributed compression. However, these works 
provided only theoretical bounds on the compression, with no method of constructing practical codes that achieve the predicted 
theoretical bounds. 

One of the essential characteristics of the standard DSC problem is that the information sources, also called encoders or 
informants, are not allowed to interact or cooperate with each other, for the purpose of compressing their information. There 
are two approaches to solve the DSC problem. First, allow the data-gathering node, also called decoder or recipient, and the 
informants to interact with each other. Second, do not allow the interaction between the recipient and informants. Starting 
with the seminal paper [1], almost all of the work in the area of DSC has followed the second approach. In the recent past, 
Pradhan and Ramchandran [3] and later [4]-[9] have provided various practical schemes to achieve the optimal performance 
using this approach. An interested reader can refer to the survey in [10] for more information. However, only a little work [11], 
[12] has been done towards solving the DSC problem when the recipient and the informants are allowed to interact with each 
other. Also, this work stops well short of addressing the general "multiple correlated informants - single recipient" interactive 
communication problem, which we address in this paper. 

In [11], only the scenario in which two correlated informants communicate with a recipient is considered. It is assumed that 
both the informants and the recipient know the joint distribution of the informants' data. Also, only the average of the total number of 
bits exchanged is minimized. In [12], only two messages are allowed to be exchanged between the encoder and a decoder, 
which may not be optimal for the general communication problem. Moreover, it does not address the problem of computing 
the optimal number of messages exchanged between the encoder and a decoder, or the optimal number of bits sent 
by the encoder and a decoder, for a given objective of communication in an interactive communication scenario. Also, 
unlike [11], this work concerns itself with lossy compression at the encoders. 

In our paper, in Section IV we first provide various formulations of the worst-case, lossless "multiple informants - single 
recipient" interactive communication problem for various objectives of communication. Then, in Section V we explicitly 
estimate the corresponding optimal number of messages and bits transmitted by both the recipient and the informants. We assume 
that the joint probability distribution of informants' data is available only at the recipient. Previously, [13]-[16] have addressed 
"single informant - single recipient" worst-case communication problem and attempted to bound its m-message complexity. 
In the same spirit, we attempt to solve here the worst-case "multiple correlated informants - single recipient" interactive 
communication problem. 

In data-gathering sensor networks, the sensor data is assumed to be correlated and only the data-gathering node needs to 
learn about sensor data. This makes data-gathering sensor networks a canonical setting to which DSC can be applied. A 
broad survey of DSC-without-interaction schemes applied to sensor networks appears in [10], and [17] makes a strong case 
for using asymmetric DSC codes, such as Turbo codes, LDPC codes, and convolutional codes, in sensor networks. However, 
in general sensor networks such proposals may not be optimal. For example, in cluster-based sensor networks, where 
different sensor nodes alternately assume the responsibility of data gathering, such proposals may be impractical, given the 
limited computational and energy resources of the nodes. In [12], it is proposed for the first time to use DSC with interaction 
in sensor networks to reduce the energy consumption at the sensor nodes. However, as mentioned earlier, its model of 
interactive communication is quite limited. Also, it does not directly relate the energy savings at the sensor nodes to the 
increase in sensor network lifetime. So, after introducing our system model for the sensor network in Section VI, in Section VII we 
apply the "multiple correlated informants - single recipient" interactive communication formalism developed in the previous 
sections to maximize the worst-case operational lifetime of data-gathering sensor networks. We conclude by providing a 
new clustering protocol, based on interactive communication, for large sensor networks, together with simulation results that 
establish the efficiency of our approach. 

A preliminary version of our ideas appears in [18], where we also extend the notions of ambiguity set and ambiguity 
proposed in [13] and derive some of their properties. We intend to address the average-case communication problem and some 
other variations of the problem considered here in future work. 

III. Notation 

In this section, we introduce the notation that will be frequently used in the rest of this paper. 

$S$: the set of $N$ informants. 

$\mathcal{X}$: finite, discrete alphabet set, $|\mathcal{X}| = n$. 

$\mathcal{P}$: $N$-dimensional discrete probability distribution, $\mathcal{P} = p(x_1, \ldots, x_N)$, $x_i \in \mathcal{X}$. 

$\Pi$: the set of all $N!$ schedules to poll $N$ informants. 

$\pi(i)$: the informant that is polled at the $i$-th position in the schedule $\pi$. 

$A_{\pi(i)}$: the set $\{\pi(1), \ldots, \pi(i-1)\}$ of informants who have already communicated their data to the recipient before the 
$i$-th informant in the schedule $\pi$. 

$S_{X_{\pi(i)}|X_{\pi(1)},\ldots,X_{\pi(i-1)}}(x_{\pi(1)}, \ldots, x_{\pi(i-1)})$: the conditional ambiguity set of the recipient in informant $\pi(i)$'s data, when the 
data vector $(X_{\pi(1)}, \ldots, X_{\pi(i-1)}) = (x_{\pi(1)}, \ldots, x_{\pi(i-1)})$. We denote it as $S_{X_{\pi(i)}|X_{A_{\pi(i)}}}$. 

$\mu_{X_{\pi(i)}|X_{\pi(1)},\ldots,X_{\pi(i-1)}}(x_{\pi(1)}, \ldots, x_{\pi(i-1)})$: the conditional ambiguity $|S_{X_{\pi(i)}|X_{A_{\pi(i)}}}|$. We denote it as $\mu_{X_{\pi(i)}|X_{A_{\pi(i)}}}$. 

$\hat{\mu}_{X_{\pi(i)}|X_{\pi(1)},\ldots,X_{\pi(i-1)}}$: the maximum conditional ambiguity. We denote it as $\hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}}$. 
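To make the notation concrete, the following is a minimal Python sketch of the conditional ambiguity set and the maximum conditional ambiguity, computed from the support of the joint distribution (the set of data vectors with nonzero probability). The function names, the 0-based informant indices, and the example support are illustrative assumptions and do not appear in the formalism itself.

```python
from itertools import product

def conditional_ambiguity_set(support, target, known):
    """Values informant `target` can still take, given the observed data `known`.

    support: set of tuples (x_0, ..., x_{N-1}) with nonzero probability.
    known:   dict {informant index: observed value} for already-polled informants.
    """
    return {
        x[target]
        for x in support
        if all(x[j] == v for j, v in known.items())
    }

def max_conditional_ambiguity(support, target, polled):
    """Largest ambiguity-set size over all possible data vectors of `polled`."""
    seen = {tuple(x[j] for j in polled) for x in support}
    return max(
        len(conditional_ambiguity_set(support, target, dict(zip(polled, vals))))
        for vals in seen
    )

# Hypothetical example: 3 informants over the alphabet {0, 1, 2, 3};
# readings of neighbouring informants differ by at most 1.
support = {
    x for x in product(range(4), repeat=3)
    if abs(x[0] - x[1]) <= 1 and abs(x[1] - x[2]) <= 1
}
print(conditional_ambiguity_set(support, 1, {0: 0}))  # {0, 1}
print(max_conditional_ambiguity(support, 1, [0]))     # 3
print(max_conditional_ambiguity(support, 2, []))      # 4
```

With no informant polled, the ambiguity set is the whole alphabet; conditioning on a polled neighbour shrinks it, which is the effect the schedules below try to exploit.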

IV. "Multiple Informants - Single Recipient" Communication Complexity 

Let us consider two persons $P_X$ and $P_Y$ interactively communicating with each other. $P_X$ observes the random variable 
$X_1 \in \mathcal{X}$ and $P_Y$ observes a possibly dependent random variable $X_2 \in \mathcal{X}$.¹ Let us assume that only $P_Y$ knows the joint 
distribution $p(x_1, x_2)$. In the worst-case, $P_Y$ needs to send $\max(1, \lceil \log\log \hat{\mu}_{X_1|X_2} \rceil)$ bits to $P_X$ to help it send its information 
in $\lceil \log \hat{\mu}_{X_1|X_2} \rceil$ bits to let $P_Y$ learn about $X_1$. However, we soon show that in the optimal communication protocol the recipient 
$P_Y$ needs to send many more bits than the number of bits given above. A meaningful message from $P_X$ to $P_Y$ reduces the 
ambiguity about $P_X$'s data at $P_Y$. 

In the following, we generalize this discussion to "N multiple correlated informants - single recipient" communication 
problem and show that the interactive communication between the recipient and informants using prefix-free messages and 
instantaneous decoding [21], reduces this problem to a serial communication problem where the optimal schedule, in which N 
"single recipient - single informant" communication problems are solved, is to be computed. We consider the communication 
problems with different objectives and provide optimal protocols for the worst-case communication for each of those problems. 

Let us consider a set of N multiple correlated informants interactively communicating with a recipient, where the objective 
of communication is that the recipient must learn about each informant's data with no probability of error, but an informant 
may or may not learn about other informants' data. 

Communication takes place over N binary, error-free channels, where each channel connects an informant with the recipient. 
An informant and the recipient can interactively communicate over the channel connecting them by exchanging messages (finite 
sequences of bits determined by agreed upon, deterministic protocol), but the informants cannot communicate directly with 
each other (although, they can communicate indirectly via the recipient). So, if in an interactive communication protocol, the 
recipient and an informant exchange at most m-messages, then at most Nm messages are exchanged before the recipient 
learns of all informants' data. Each bit communicated over any channel, in either direction is counted. We want to estimate 
the optimal number of messages and bits exchanged, in the worst-case, for such scenarios. 

The problem of interactive communication between a single recipient and one or more informants has various variations, 
which are of interest depending on the communication scenario being modeled and some of these variations are already studied 
in existing literature. For example, depending on whether only the recipient knows the joint distribution of informants' data or 
both the recipient and informants know it, whether sum of the total number of bits communicated or the maximum number 
of bits communicated by any node is minimized, and whether this minimization is carried only over the set of informants or 
over the recipient and the set of informants, one can formulate different problems. There can be many more such variations, 
such as whether one considers lossless or lossy communication. However, in the present work we concern ourselves with some 
such variations. 

In this work, we assume that the joint probability distribution $\mathcal{P} = p(x_1, \ldots, x_N)$, $x_i \in \mathcal{X}$, of informants' data is 
known only to the recipient. Contrast this with the communication scenarios considered in [11], [13]-[15], where both the recipient 
and the informant know the joint distribution. However, note that [11], [13]-[15] only considered the "single recipient - single 
informant" communication problem. In the present work, we consider the worst-case communication complexity of four 
different problems corresponding to our assumption that only the recipient knows $\mathcal{P}$. It should be noted that this assumption 
can also be made in the communication scenarios where even the recipient does not know $\mathcal{P}$, as follows. Let us assume that 
at the beginning of the communication between the recipient and the informants, the recipient does not know $\mathcal{P}$. However, 
as the recipient collects the information from the informants drawn from $\mathcal{P}$, it would eventually be able to estimate 
$\mathcal{P}$. For example, in [12] a linear predictive model is used to estimate the correlation structure. So, once the recipient has the 
estimate of $\mathcal{P}$, our formalism applies. We emphasize that we assume nothing about this distribution, except that it is a discrete 
distribution with finite alphabet. The underlying assumption of our work is that the correlation model is either already known 
to the recipient or can be learnt by it. However, in the communication scenarios where it is not so, our formalism does not 
apply. 

Let $R_\pi$ denote the total number of bits transmitted by the recipient, under schedule $\pi$, to all the $N$ informants, in the worst 
case. Let $I_{\pi(i),R}$ denote the number of bits transmitted by the informant $\pi(i)$ to the recipient, in the worst case. Also, let $m$ 
denote the total number of messages exchanged between the recipient and an informant, before the recipient unambiguously 
learns of the informant's data. So, in the worst-case, we have the following four communication problems. 

$$\min_{m \ge 1} \min_{\pi \in \Pi} \max_{i=1,\ldots,N} I_{\pi(i),R} \tag{1}$$

$$\min_{m \ge 1} \min_{\pi \in \Pi} \max\Big(R_\pi, \max_{i=1,\ldots,N} I_{\pi(i),R}\Big) \tag{2}$$

$$\min_{m \ge 1} \min_{\pi \in \Pi} \sum_{i=1}^{N} I_{\pi(i),R} \tag{3}$$

$$\min_{m \ge 1} \min_{\pi \in \Pi} \Big(R_\pi + \sum_{i=1}^{N} I_{\pi(i),R}\Big) \tag{4}$$

¹In general, $X_1 \in \mathcal{X}_1$ and $X_2 \in \mathcal{X}_2$, where $\mathcal{X}_1$ and $\mathcal{X}_2$ are discrete alphabet sets, with possibly different cardinalities. However, to keep the discussion 
simple, we assume henceforth that all the random variables take their values from the same discrete alphabet $\mathcal{X}$, unless stated otherwise. 

Note that in the above problem formulations, the first node $\pi(1)$ in any schedule $\pi$ sends its data uncompressed or, at most, 
compressed based on its past data. This node cannot exploit the data correlation structure to compress its data. 

V. Worst-Case Communication Complexity 

Let us consider a communication schedule $\pi \in \Pi$. Let us assume that the informants $\pi(1), \ldots, \pi(i-1)$ have already 
communicated their data to the recipient. Every informant $\pi(i)$ knows that it needs to send its data in at most $\lceil \log n \rceil$ bits to 
the recipient, where $n$ is the number of possible data values any informant's data can assume. The conditional ambiguity set 
of the recipient of informant $\pi(i)$'s data is $S_{X_{\pi(i)}|X_{A_{\pi(i)}}}$, with $\hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \le n$. 

Before solving the problems in (1), (2), (3), and (4), we list without proof, the following properties of conditional ambiguity 
set, conditional ambiguity, and maximum conditional ambiguity, respectively. 

$$S_{X_{\pi(i)}|X_{A_{\pi(i)}}}(x_{A_{\pi(i)}}) = \bigcap_{j=1}^{i-1} S_{X_{\pi(i)}|X_{\pi(j)}}(x_{\pi(j)})$$

A. Solution for (1) 

Complexity of one-message communication: When the recipient and an informant are allowed to exchange only one 
message, then this message is from the informant to the recipient. As the informant in such a scenario has no information about 
the ambiguity set of the recipient in its data, it sends $\lceil \log n \rceil$ bits to the recipient. In such a situation, the solution to the problem 
in (1) is trivial, as any order in which the informants communicate with the recipient results in an optimal communication 
schedule. 

Complexity of two-message communication: With the recipient and an informant allowed to exchange two messages, the 
recipient sends the first message to the informant, then based on its own information and the information contained in the 
recipient's message, the informant sends the second message to the recipient. 

Given that the ambiguity set of the recipient of informant $\pi(i)$'s data is $S_{X_{\pi(i)}|X_{A_{\pi(i)}}}$, with maximum ambiguity $\hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}}$, 
in the worst-case, the recipient requires at least $\lceil \log \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \rceil$ bits to learn unambiguously about $\pi(i)$'s data. So, it is both 
necessary and sufficient that $\pi(i)$ sends $\lceil \log \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \rceil$ bits to the recipient. However, to help $\pi(i)$ send its information 
in just these many bits, the recipient informs it in $\hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \lceil \log n \rceil$ bits about those of its $n$ possible data values which 
belong to $S_{X_{\pi(i)}|X_{A_{\pi(i)}}}$. Then, $\pi(i)$ constructs the prefix-free codes corresponding to those data values and sends the code 
corresponding to its actual data value to the recipient in $\lceil \log \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \rceil$ bits. 
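The two-message exchange described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive protocol implementation: the function name is hypothetical, the prefix-free code is taken to be a fixed-length binary index into the sorted candidate list, and only the cost of listing the candidate values is counted for the recipient.

```python
from math import ceil, log2

def two_message_exchange(candidates, n, actual):
    """Simulate one recipient -> informant -> recipient exchange.

    candidates: the recipient's ambiguity set for this informant.
    n:          alphabet size (each value costs ceil(log2 n) bits to name).
    actual:     the informant's true value (must be in candidates).
    """
    ordered = sorted(candidates)
    width = ceil(log2(len(ordered))) if len(ordered) > 1 else 0
    # Message 1: the recipient names every candidate value.
    recipient_bits = len(ordered) * ceil(log2(n))
    # Message 2: the informant sends the index of its value as a fixed-length
    # codeword (fixed-length codes are trivially prefix-free).
    codeword = format(ordered.index(actual), "b").zfill(width) if width else ""
    # The recipient decodes by looking the index up in the same ordered list.
    decoded = ordered[int(codeword, 2)] if width else ordered[0]
    return recipient_bits, len(codeword), decoded

print(two_message_exchange({5, 6, 7}, n=16, actual=7))  # (12, 2, 7)
```

In this example the informant sends 2 bits instead of the 4 it would need without interaction, while the recipient pays 12 bits to describe the ambiguity set.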

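Following this protocol to poll all the informants, the total number of bits transmitted by the recipient under schedule $\pi$ is 

$$R_\pi = \sum_{i=1}^{N} B_{R,\pi(i)} = \sum_{i=1}^{N} \Big( \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \lceil \log n \rceil + \lceil \log\log \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \rceil \Big). \tag{5}$$

The total number of bits transmitted by the informant $\pi(i)$ is 

$$I_{\pi(i),R} = \lceil \log \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \rceil. \tag{6}$$

$R_\pi$ bits are sufficient for any model of correlation in the informants' data and necessary too for some models of correlation. 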



Theorem 1: For $m \ge 2$, $\lceil \log \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \rceil$ bits are both necessary and sufficient for the recipient to unambiguously learn 
about informant $\pi(i)$'s information. 

Proof: Omitted for brevity. ■ 
Corollary 1: Two messages are optimal. 

Proof: The previous theorem proves that $\lceil \log \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \rceil$ bits from informant $\pi(i)$ are both necessary and sufficient for the 
recipient to learn about $\pi(i)$'s data. Also, each informant sends this optimum number of bits even when only two messages are 
allowed to be exchanged between the recipient and the informant $\pi(i)$. So, using the principle of Occam's razor, two messages 
are optimal. ■ 
We are interested in finding the schedule $\pi^*$ that solves (1). However, Theorem 1 and Corollary 1 reduce it to 

$$\pi^* = \arg\min_{\pi \in \Pi} \max_{i=1,\ldots,N} I_{\pi(i),R}. \tag{7}$$

The minmax nature of the problem in (7) ensures that the Minimum Cost Next (MCN) algorithm described below computes 
the optimal schedule in (7). 

Algorithm: MCN 

1 Initialization: $k = 1$, $A_{\pi^{MCN}(k)} = \emptyset$. 

2 while ($k \le N$) 

3 $\pi^{MCN}(k) = \arg\min_{i \in S - A_{\pi^{MCN}(k)}} I_{i,R}$. 

4 $A_{\pi^{MCN}(k+1)} = A_{\pi^{MCN}(k)} \cup \pi^{MCN}(k)$. 

5 $k = k + 1$. 
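The MCN steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the cost of an informant is taken to be the worst-case bit count of (6) conditioned on the informants already polled, ties are broken by the smallest index, informants are 0-indexed, and the helper names and the example support are hypothetical. A brute-force search over all schedules is included only to check the greedy result on a small example.

```python
from itertools import permutations
from math import ceil, log2

def max_ambiguity(support, target, polled):
    """Worst-case number of values informant `target` can still take."""
    groups = {}
    for x in support:
        key = tuple(x[j] for j in polled)
        groups.setdefault(key, set()).add(x[target])
    return max(len(values) for values in groups.values())

def informant_bits(support, target, polled):
    """ceil(log2(max ambiguity)): worst-case bits the informant sends."""
    return ceil(log2(max_ambiguity(support, target, polled)))

def mcn_schedule(support, num_informants):
    """Greedy Minimum Cost Next: always poll the currently cheapest informant."""
    polled, remaining = [], set(range(num_informants))
    while remaining:
        nxt = min(sorted(remaining),
                  key=lambda i: informant_bits(support, i, polled))
        polled.append(nxt)
        remaining.remove(nxt)
    return polled

def worst_informant_cost(support, schedule):
    """Min-max objective: largest per-informant bit count under a schedule."""
    return max(informant_bits(support, node, schedule[:pos])
               for pos, node in enumerate(schedule))

# Hypothetical support: informant 0 reads a in 0..7, informants 1 and 2
# read coarser versions of the same quantity.
support = {(a, a // 2, a // 4) for a in range(8)}
schedule = mcn_schedule(support, 3)
best = min(worst_informant_cost(support, list(p))
           for p in permutations(range(3)))
print(schedule, worst_informant_cost(support, schedule), best)  # [2, 1, 0] 1 1
```

Polling the coarsest informant first lets every later informant answer in a single bit, whereas any schedule that starts with informant 0 forces it to send 3 bits.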



Lemma 1: MCN schedule solves (7). 

Proof: We describe a procedure to modify a given schedule into another schedule such that the value of the objective function 
does not increase. It will be apparent that iteratively applying this procedure on any schedule finally leads to the MCN schedule 
$\pi^{MCN}$. Let $\pi^{OLD}$ be any schedule. Suppose it differs from $\pi^{MCN}$ first in the $m$-th position, that is, 

$$\pi^{OLD}(k) = \pi^{MCN}(k), \quad 1 \le k \le m-1, \tag{8}$$
$$\pi^{OLD}(m) \ne \pi^{MCN}(m).$$

Then there exists a number $l$ such that $\pi^{OLD}(l) = \pi^{MCN}(m)$, $l > m$. We construct a new schedule $\pi^{NEW}$ by modifying 
$\pi^{OLD}$ as follows: 

$$\pi^{NEW}(k) = \pi^{MCN}(k), \quad 1 \le k \le m, \tag{9}$$
$$\pi^{NEW}(k) = \pi^{OLD}(k-1), \quad m < k \le l,$$
$$\pi^{NEW}(k) = \pi^{OLD}(k), \quad l < k \le N.$$

In words, in $\pi^{NEW}$ we poll $\pi^{MCN}$ for the first $m$ slots, followed by $\pi^{OLD}$ for the next $N - m$ slots. 
In order to establish that $\pi^{NEW}$ is at least as good as $\pi^{OLD}$, we need to show that 

$$\max_{i=1,\ldots,N} I_{\pi^{NEW}(i),R} \le \max_{i=1,\ldots,N} I_{\pi^{OLD}(i),R}. \tag{10}$$

From (9), it follows that $I_{\pi^{NEW}(i),R} = I_{\pi^{OLD}(i),R}$ for $1 \le i \le m-1$ and $l+1 \le i \le N$. 
So, it suffices to show that 

$$\max_{i=m,\ldots,l} I_{\pi^{NEW}(i),R} \le \max_{i=m,\ldots,l} I_{\pi^{OLD}(i),R}. \tag{11}$$

Using a lemma in [18] that states that conditioning reduces ambiguity, we have 

$$\max_{i=m+1,\ldots,l} I_{\pi^{NEW}(i),R} \le \max_{i=m+1,\ldots,l} I_{\pi^{OLD}(i-1),R}. \tag{12}$$

Moreover, the MCN construction ensures that 

$$I_{\pi^{NEW}(m),R} \le I_{\pi^{OLD}(m),R}. \tag{13}$$

Equations (12) and (13) imply (11), proving the lemma. ■ 



B. Solution for (2) 

Complexity of one-message communication: The communication problem here is the same as the corresponding problem 
in Subsection V-A. Every informant sends $\lceil \log n \rceil$ bits to the recipient, the recipient sends no bits, and any order in which 
the informants communicate with the recipient results in an optimal communication schedule. 

Complexity of two-message communication: Using the two-message protocol of Subsection V-A, we see that for 
every recipient-informant communication pair, the number of bits $B_{R,\pi(i)}$ transmitted by the recipient in communicating with 
informant $\pi(i)$ is always more than the number of bits $I_{\pi(i),R}$ transmitted by the informant $\pi(i)$. This implies that 

$$R_\pi > \max_{i=1,\ldots,N} I_{\pi(i),R}.$$

So, in this case (2) reduces to finding a schedule $\pi$ that minimizes $R_\pi$. However, as $B_{R,\pi(i)} > \lceil \log n \rceil$, so is $R_\pi$. So, the 
two-message complexity of this protocol for the problem in (2) is more than the one-message complexity. This implies that 
this two-message protocol is not optimal. In the following, we prove that there is no two-message protocol whose complexity 
is less than the complexity of the one-message protocol given above. 

Theorem 2: There is no two-message protocol with complexity less than $\lceil \log n \rceil$. 

Proof: Omitted for brevity. ■ 
Corollary 2: One message protocol is optimal for the problem in (2). 

Proof: The proof follows from the last theorem. ■ 
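A small numerical illustration of why interaction does not help for problem (2) is sketched below in Python. The function name and the ambiguity values are hypothetical, and the recipient's cost counts only the bits needed to list each ambiguity set, so the two-message figure is, if anything, an underestimate.

```python
from math import ceil, log2

def problem_two_cost(n, max_ambiguities, two_message):
    """Objective of problem (2): max(recipient total bits, largest informant bits)."""
    name_bits = ceil(log2(n))
    if not two_message:
        # One message: the recipient is silent, every informant sends ceil(log n) bits.
        return name_bits
    # Two messages: the recipient lists each ambiguity set, the informant replies.
    recipient_total = sum(mu * name_bits for mu in max_ambiguities)
    informant_max = max(ceil(log2(mu)) for mu in max_ambiguities)
    return max(recipient_total, informant_max)

ambiguities = [16, 2, 2]   # hypothetical: the first node is unconditioned, so mu = n
print(problem_two_cost(16, ambiguities, two_message=False))  # 4
print(problem_two_cost(16, ambiguities, two_message=True))   # 80
```

The recipient's listing cost dominates the objective as soon as it speaks at all, which is consistent with the one-message protocol being optimal for this objective.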

C. Solutions for (3) and (4) 

Due to space constraints, we do not discuss the optimal solutions for the problems in (3) and (4). 

VI. Sensor Network: System Model 

We consider a network of $N$ battery-operated sensor nodes strewn in a coverage area. The nodes are assumed to interactively 
communicate with the base-station in a single hop. Sensor node $k$, $k \in \{1, \ldots, N\}$, has $E_k$ units of energy and the base-station 
has $E_{BS}$ units of energy. The wireless channel between sensor $k$ and the base-station is described by a symmetrical path loss 
$d_k$, which captures various channel effects and is assumed to be constant. This is reasonable for static networks and also for 
the scenarios where the path loss varies slowly and can be accurately tracked. 

The network operates in a time-division multiple access (TDMA) mode. Time is divided into slots and in each slot, the 
base-station gathers data from every sensor node. Let us assume that the sensor data at every time slot is described by a random 
vector $(X_1, \ldots, X_N) \sim \mathcal{P}$. This distribution is known only to the base-station. We assume spatial correlation in the sensor 
data and ignore temporal correlation, as the latter can easily be incorporated in our work for data sources satisfying the Asymptotic 
Equipartition Property. 

We assume static scheduling, that is, the base-station uses the same sensor polling schedule in every time slot, until the 
network dies. The worst-case lifetime of a sensor node (base-station) under schedule $\pi \in \Pi$ is defined as the ratio of its total 
energy and its worst-case energy expenditure in a slot, under schedule $\pi$. However, as argued in the Introduction, it is only the 
communication energy expenditure that we are concerned with here. We define network lifetime as the time until the first 
sensor node or the base-station runs out of energy. This definition has the benefit of being simple, practical, and popular 
[19] and, as shown below, provides a neat and intuitive maxmin formulation of the network lifetime in terms of the lifetimes 
of the sensor nodes and the base-station. 

To model the transmit energy consumption at the base-station and the sensor nodes, we assume that transmission rate 
is linearly proportional to signal power. This assumption is motivated by Shannon's AWGN capacity formula, which is 
approximately linear for low data rates. So, a node $k$ under schedule $\pi$ expends $B_{\pi(k)} d_k$ units of energy to transmit $B_{\pi(k)}$ 
units of information. Let $E_r$ denote the energy cost of receiving one bit of information. For simplicity, let us assume that it 
is the same for both the base-station and the sensor nodes. 

The general sensor network lifetime maximization problem is to solve the joint source-channel coding problem for multi-access 
networks. However, we assume the separation between source and channel coding, though it is well-known that, in general, 
the source-channel separation does not hold for the multi-access joint source-channel coding problem [20]. In this work, we 
assume that the optimal channel coding is employed. So, the general problem reduces to solving the distributed source coding 
problem to find the optimal rates (the number of bits to transmit), which maximize network lifetime. However, the optimal 
rate-allocation is constrained to lie within the Slepian-Wolf achievable rate region. This makes the problem computationally 
challenging. We simplify the problem by introducing the notion of instantaneous decoding [21] and thus reduce the optimal 
rate allocation problem to computing the optimal scheduling order, albeit at some loss of optimality. This loss of optimality 
occurs because, in general, turning a multiple-access channel into an array of orthogonal channels by using a suitable MAC 
protocol (TDMA in our case) is well-known to be a suboptimal strategy, in the sense that the set of rates that are achievable 
with orthogonal access is strictly contained in the Ahlswede-Liao capacity region [22]. 



VII. Maximizing Sensor Network Lifetime 

To begin with, let us assume that the interaction between the base-station and the sensor nodes is not allowed. Then, in 
the worst-case, every node sends $\lceil \log n \rceil$ bits to the base-station to convey its information. However, if every node knows $\mathcal{P}$ 
and the data of all other nodes, then it only needs to send the bits describing its data conditioned on the data of the nodes 
already polled [23]. In real single-hop sensor networks, it is neither possible for every node to know all other nodes' 
data, given the limited communication capabilities of the sensor nodes, nor desirable that the sensor nodes perform such 
computationally intense processing, given their limited computational and energy capabilities. 

However, if we allow interaction between the base-station and sensor nodes, then the nodes can still send fewer than $\lceil \log n \rceil$ 
bits, yet avoid the above issues. In fact, this is precisely the "multiple correlated informants - single recipient" communication 
problem of Section IV. Using the results derived there and identifying the recipient as the base-station and informants as the 
sensor nodes, in the following, we attempt to maximize the worst-case lifetime of the single-hop sensor networks, for the given 
model of energy consumption and spatial correlation in the sensor data. 

The base-station and a sensor node interactively communicate by exchanging the optimal number of messages for the different 
communication problems given in Section IV. To estimate the worst-case lifetime of the sensor networks with the given 
objective of communication, we use the protocols in Section V for the base-station and sensor node communication. One of the major 
results of our work on the worst-case "multiple informants - single recipient" interactive communication problem is that for 
the formulations of this problem in (1)-(4), it is the recipient which carries most of the burden of communication and 
computation. So, in the context of the sensor networks, this implies that the corresponding role is played by the base-station. 
This reduces the energy consumption at the sensor nodes, hence enhancing their lifetimes, with a concomitant increase in the 
network lifetime. This is reasonable, for example, in the scenarios where the base-station is computationally and energy-wise 
more capable than the sensor nodes, as discussed in [17]. Still, it may not be infinitely more capable. So, in the network 
lifetime estimation problem, we consider the total communication (transmission and reception) energy expenditure at every 
sensor node as well as the base-station, to also include the situations where it is the base-station that runs out of energy 
first. 

A. Worst-Case Network Lifetime 

Let $E_{BS,\pi(i)}$ denote the energy that the base-station spends in communicating with node $\pi(i)$ in the worst-case, that is, it 
denotes the energy that the base-station spends in transmitting and receiving the bits from node $\pi(i)$, in the worst-case. So, 

$$E_{BS,\pi(i)} = B_{BS,\pi(i)} d_i + I_{\pi(i),BS} E_r. \tag{14}$$

Similarly, let $E_{\pi(i),BS}$ denote the energy that the node $\pi(i)$ spends in communicating with the base-station. So, 

$$E_{\pi(i),BS} = I_{\pi(i),BS} d_i + B_{BS,\pi(i)} E_r. \tag{15}$$

On substituting for $B_{BS,\pi(i)}$ and $I_{\pi(i),BS}$ from (5) and (6), respectively, we have 

$$E_{BS,\pi(i)} - E_{\pi(i),BS} = \Big( \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \lceil \log n \rceil + \lceil \log\log \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \rceil - \lceil \log \hat{\mu}_{X_{\pi(i)}|X_{A_{\pi(i)}}} \rceil \Big) (d_i - E_r). \tag{16}$$

Assuming $d_i > E_r$, this implies that $E_{BS,\pi(i)} - E_{\pi(i),BS} \ge 0$, that is, the base-station spends more energy in communicating 
with node $\pi(i)$ than vice versa. 
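As a quick sanity check of (14)-(16), the following Python sketch evaluates both energy expressions for one node and confirms that their difference factors as (bits sent by the base-station minus bits sent by the node) times $(d_i - E_r)$. The function names and the numerical values are assumptions chosen for illustration only.

```python
def base_station_energy(b_bs, i_node, d, e_r):
    """Eq. (14): transmit b_bs bits over path loss d, receive i_node bits."""
    return b_bs * d + i_node * e_r

def node_energy(b_bs, i_node, d, e_r):
    """Eq. (15): transmit i_node bits over path loss d, receive b_bs bits."""
    return i_node * d + b_bs * e_r

# Hypothetical values: the base-station sends 12 bits, the node replies with 2,
# path loss d = 5 energy units per bit, reception cost E_r = 1 unit per bit.
b_bs, i_node, d, e_r = 12, 2, 5, 1
gap = base_station_energy(b_bs, i_node, d, e_r) - node_energy(b_bs, i_node, d, e_r)
print(gap, (b_bs - i_node) * (d - e_r))  # 40 40
```

Since the base-station always sends at least as many bits as the node under the two-message protocol, the gap is non-negative whenever $d_i > E_r$.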

Given our definitions of the sensor node, the base-station, and the network lifetimes, the worst-case lifetime $L$ of the network 
is the solution to the following optimization problem 

$$L = \max_{\pi \in \Pi} \min\left( \frac{E_{BS}}{\sum_{i=1}^{N} E_{BS,\pi(i)}}, \min_{i=1,\ldots,N} \frac{E_{\pi(i)}}{E_{\pi(i),BS}} \right), \tag{17}$$

$$L^{-1} = \min_{\pi \in \Pi} \max\left( \frac{\sum_{i=1}^{N} E_{BS,\pi(i)}}{E_{BS}}, \max_{i=1,\ldots,N} \frac{E_{\pi(i),BS}}{E_{\pi(i)}} \right). \tag{18}$$
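For a single fixed schedule, the inner expressions of (17) and (18) can be evaluated as in the following Python sketch. It is an illustration under assumed inputs: the function names and energy figures are hypothetical, exact rational arithmetic is used to avoid rounding, and the outer optimization over schedules is left out.

```python
from fractions import Fraction

def network_lifetime(e_bs, e_nodes, bs_costs, node_costs):
    """Worst-case lifetime in slots for one fixed schedule, as in (17).

    e_bs:       initial energy of the base-station.
    e_nodes:    initial energy of each sensor node.
    bs_costs:   per-slot energy the base-station spends on each node.
    node_costs: per-slot energy each node spends on the base-station.
    """
    bs_life = Fraction(e_bs, sum(bs_costs))
    node_life = min(Fraction(e, c) for e, c in zip(e_nodes, node_costs))
    return min(bs_life, node_life)

def inverse_lifetime(e_bs, e_nodes, bs_costs, node_costs):
    """The equivalent min-max form (18): largest energy fraction used per slot."""
    bs_rate = Fraction(sum(bs_costs), e_bs)
    node_rate = max(Fraction(c, e) for e, c in zip(e_nodes, node_costs))
    return max(bs_rate, node_rate)

args = (1000, [100, 100], [62, 30], [22, 10])
print(network_lifetime(*args))      # 50/11
print(1 / inverse_lifetime(*args))  # 50/11
```

Here the first sensor node is the bottleneck: it spends 22 of its 100 energy units per slot, so the network survives 50/11 slots, and both forms agree.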

As it was proven in Section V that interaction helps in solving the problems (1) and (3), in the following we estimate 
the network lifetime when the corresponding communication protocols are used in the network for data-gathering. More 
precisely, we use the optimum two-message communication protocol for the problem (1). However, before we discuss the 
general solution, let us consider its two special cases. 



Case 1: Let $E_{BS} = E_1 = \ldots = E_N = E$. This is so when $N + 1$ identical sensors form a sensor cluster and one of those 
sensor nodes is also chosen as the clusterhead. Then, the problem in (18) reduces to 

$$L^{-1} = \frac{1}{E} \min_{\pi \in \Pi} \max\left( \sum_{i=1}^{N} E_{BS,\pi(i)},\ \max_{i=1,\ldots,N} E_{\pi(i),BS} \right).$$

However, from (16), we know that $\sum_{i=1}^{N} E_{BS,\pi(i)} \ge \max_{i=1,\ldots,N} E_{\pi(i),BS}$, so the above equation reduces to 

$$L^{-1} = \frac{1}{E} \min_{\pi \in \Pi} \sum_{i=1}^{N} E_{BS,\pi(i)}. \qquad (19)$$

In Lemma 2 below, we prove that the Minimum Cost Next (MCN) algorithm described in Section V computes the optimal lifetime for 
the optimization problem in (19). 

Lemma 2: The MCN schedule solves 

$$\pi_{sum} = \arg\min_{\pi \in \Pi} \sum_{i=1}^{N} E_{BS,\pi(i)}. \qquad (20)$$

Proof: Changing line 3 of the MCN algorithm in Section V to $\pi_{sum}(k) = \arg\min_{i \in S - A} \sum_{j \in A \cup \{i\}} E_{BS,j}$, we obtain a version 
of the MCN algorithm that solves (20). Then the proof is identical to the proof of Lemma 1 if in equations (10)-(13) we make 
use of the following mappings: 

$$\max_{i=1,\ldots,N} \mapsto \sum_{i=1}^{N}, \qquad l_{\pi^{NEW}(i),R} \mapsto E_{BS,\pi^{NEW}(i)}, \qquad l_{\pi^{OLD}(i),R} \mapsto E_{BS,\pi^{OLD}(i)}.$$

■ 
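
The greedy structure that Lemma 2 relies on can be sketched as below. This is only a generic "pick the cheapest next node" skeleton under stated assumptions, not the MCN listing of Section V: the function `greedy_schedule`, the callback `step_cost(i, polled)` and the toy cost are all hypothetical stand-ins for the per-step cost used in line 3 of that algorithm.

```python
def greedy_schedule(nodes, step_cost):
    """Build a polling order by repeatedly picking the cheapest next node.

    step_cost(i, polled) is a caller-supplied (hypothetical) cost of polling
    node i after the nodes already listed in `polled`.
    """
    polled = []
    remaining = sorted(nodes)
    while remaining:
        # min() returns the first minimum, so ties are broken by node id.
        nxt = min(remaining, key=lambda i: step_cost(i, polled))
        polled.append(nxt)
        remaining.remove(nxt)
    return polled


# Toy cost, for illustration only: a fixed cost for the first node, then the
# distance to the nearest node that has already been polled.
positions = {0: 0.0, 1: 1.0, 2: 5.0}


def toy_cost(i, polled):
    if not polled:
        return 10.0
    return min(abs(positions[i] - positions[j]) for j in polled)


schedule = greedy_schedule(positions, toy_cost)
```

Swapping the cost callback is the analogue of changing line 3 of the algorithm: the loop stays the same and only the quantity being minimized at each step changes.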

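Case 2: Let $E_1 = \ldots = E_N = E$, but $E_{BS} \gg E$. This is so when the base-station is infinitely more capable than any of 
the identical sensor nodes. Then, (18) reduces to 

$$L^{-1} = \min_{\pi \in \Pi} \max\left( \frac{\sum_{i=1}^{N} E_{BS,\pi(i)}}{E_{BS}},\ \frac{1}{E} \max_{i=1,\ldots,N} E_{\pi(i),BS} \right) = \frac{1}{E} \min_{\pi \in \Pi} \max_{i=1,\ldots,N} E_{\pi(i),BS}, \quad \text{for } E_{BS} \gg E. \qquad (21)$$

In Lemma 3, we prove that the Minimum Cost Next algorithm above computes the optimal lifetime for the optimization problem 
in (21) too. 

Lemma 3: The MCN schedule solves 

$$\pi_{max} = \arg\min_{\pi \in \Pi} \max_{i=1,\ldots,N} \frac{E_{\pi(i),BS}}{E_{\pi(i)}}. \qquad (22)$$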

Proof: Changing line 3 of the MCN algorithm in Section V to the corresponding rule for $\pi_{max}(k)$, again an $\arg\min$ over $i \in S - A$, we obtain a version of the 
MCN algorithm that solves (22). Then the proof is identical to the proof of Lemma 1 if in equations (10)-(13) we make use of 
the following mappings: 

$$l_{\pi^{NEW}(i),R} \mapsto \frac{E_{\pi^{NEW}(i),BS}}{E_{\pi^{NEW}(i)}}, \qquad l_{\pi^{OLD}(i),R} \mapsto \frac{E_{\pi^{OLD}(i),BS}}{E_{\pi^{OLD}(i)}}.$$

■ 

The general problem in (17), or equivalently in (18), can be solved as follows. It follows from Lemmas 2 and 3 that $\pi_{sum}$ 
and $\pi_{max}$ are the MCN schedules which optimally solve (20) and (22), respectively. Let $S^{MCN} = \{\pi_{sum}, \pi_{max}\}$. Then, (18) 
reduces to: 

$$L^{-1} = \min_{\pi \in S^{MCN}} \max\left( \frac{\sum_{i=1}^{N} E_{BS,\pi(i)}}{E_{BS}},\ \max_{i=1,\ldots,N} \frac{E_{\pi(i),BS}}{E_{\pi(i)}} \right). \qquad (23)$$

Theorem 3: $L^{-1}$ in (23) is optimal. 

Fig. 1. Various orderings possible among $S_{\pi^*}$, $\mathrm{Max}_{\pi^*}$, $S_{\pi_{max}}$, and $\mathrm{Max}_{\pi_{max}}$, when $S_{\pi_{max}} < \mathrm{Max}_{\pi_{max}}$. 



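Proof: We prove the theorem by contradiction. Let $\pi^* \notin \{\pi_{sum}, \pi_{max}\}$ be the optimal schedule. Without loss of 
generality, let us assume that the schedule $\pi_{max}$ minimizes the RHS of (23). However, with $\pi^*$ being the optimal schedule, 
we have 

$$\max\left( \frac{\sum_{i=1}^{N} E_{BS,\pi^*(i)}}{E_{BS}},\ \max_{i=1,\ldots,N} \frac{E_{\pi^*(i),BS}}{E_{\pi^*(i)}} \right) < \max\left( \frac{\sum_{i=1}^{N} E_{BS,\pi_{max}(i)}}{E_{BS}},\ \max_{i=1,\ldots,N} \frac{E_{\pi_{max}(i),BS}}{E_{\pi_{max}(i)}} \right). \qquad (24)$$

For the sake of simplicity, let us use the following notation: 

$$S_{\pi^*} = \frac{\sum_{i=1}^{N} E_{BS,\pi^*(i)}}{E_{BS}}, \qquad \mathrm{Max}_{\pi^*} = \max_{i=1,\ldots,N} \frac{E_{\pi^*(i),BS}}{E_{\pi^*(i)}},$$

$$S_{\pi_{max}} = \frac{\sum_{i=1}^{N} E_{BS,\pi_{max}(i)}}{E_{BS}}, \qquad \mathrm{Max}_{\pi_{max}} = \max_{i=1,\ldots,N} \frac{E_{\pi_{max}(i),BS}}{E_{\pi_{max}(i)}}.$$

Once more without loss of generality, let us assume that for the schedule $\pi_{max}$ 

$$S_{\pi_{max}} < \mathrm{Max}_{\pi_{max}}.$$

This, along with (24), implies the six possibilities of relative ordering among $S_{\pi^*}$, $\mathrm{Max}_{\pi^*}$, $S_{\pi_{max}}$, and $\mathrm{Max}_{\pi_{max}}$, as in Figure 1. 

It is obvious from Figure 1 that in all cases $\mathrm{Max}_{\pi^*} < \mathrm{Max}_{\pi_{max}}$. So, using the sequence of steps in Lemma 3, we can 
iteratively convert the schedule $\pi^*$ to the schedule $\pi_{max}$ without any loss of its optimality, proving the optimality of $\pi_{max}$ 
with respect to minimizing the RHS of (23). ■ 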
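
The count of six orderings used in the proof can be checked mechanically. The sketch below is a small Python enumeration, assuming the four quantities are pairwise distinct so that only their relative ranks matter; the names `S_star`, `Max_star`, `S_max` and `Max_max` are hypothetical labels for the four quantities compared in Figure 1.

```python
from itertools import permutations

NAMES = ("S_star", "Max_star", "S_max", "Max_max")


def feasible_orderings():
    """Enumerate strict rankings of the four quantities that satisfy both the
    assumption S_max < Max_max and inequality (24)."""
    found = []
    for ranks in permutations(range(4)):
        v = dict(zip(NAMES, ranks))
        assumed = v["S_max"] < v["Max_max"]
        ineq_24 = max(v["S_star"], v["Max_star"]) < max(v["S_max"], v["Max_max"])
        if assumed and ineq_24:
            found.append(v)
    return found


orderings = feasible_orderings()
count = len(orderings)
always_smaller = all(v["Max_star"] < v["Max_max"] for v in orderings)
```

Both constraints force the largest rank onto the fourth quantity, leaving the other three free to be arranged in any order, which is where the six cases come from.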

VIII. A New Communication Protocol for Sensor Networks Based on Interactive Communication 

In Section IV, we mentioned that the assumption of instantaneous decoding reduces the "multiple correlated informants 
- single recipient" communication problem to a serial communication problem, where the recipient polls the next informant 
in the polling schedule only after retrieving the complete information from the current one. So, for $N$ informants, $N$ rounds 
of information gathering are executed serially. Using this protocol in single-hop sensor networks introduces a delay in 
data-collection at the base-station, which grows at least as $N$. This delay may be tolerable for small sensor networks, but most 
probably not for large networks. In this section, we propose a low-delay communication protocol for arbitrarily large 
networks, based on the LEACH protocol [24]. Our protocol is the same as LEACH in the cluster formation step, but differs from 
it in the data gathering step: within a cluster, the clusterhead and sensor nodes communicate 
interactively using the formalism developed in Section VII. As the data collection by the clusterheads in all clusters proceeds 
in parallel, this keeps the overall data-gathering delay at the base-station bounded. 

In LEACH, the sensor nodes do not compress their data, so if each sensor node's data is derived from some finite set with 
cardinality $n$, then every sensor node sends $\lceil \log n \rceil$ bits to the clusterhead, and the clusterhead compresses the data and sends 
it to the base-station. The achievable compression-ratio $r$ depends on the application and the type of data being sensed. 

Our protocol, like LEACH, can be extended to form hierarchical clusters in very large sensor networks. In such networks, 
the clusterhead nodes interactively communicate with super-clusterhead nodes, and so on up to the top layer of the hierarchy, 
at which point the data is communicated to the base-station. Such a hierarchy can save a large amount of energy, yet keep 
the data-gathering delay within tolerable bounds. 

IX. Simulation Results 

For the purposes of modeling and performance simulations, we assume that the sensor network consists of N sensor nodes 
uniformly distributed over a circle of radius R. The base-station is at the center of the circle. Each sensor node has at most n 
bits of data to send to the base-station. 
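
A deployment of this kind can be generated as in the following minimal Python sketch. It assumes that "uniformly distributed over a circle" means uniform in area over the disk of radius $R$ centered on the base-station; the function name, the seed and the sample radius are hypothetical and are not the simulation code used for the results below.

```python
import math
import random


def sample_nodes(num_nodes, radius, seed=0):
    """Place nodes uniformly (in area) on a disk centered at the origin,
    where the base-station is assumed to sit."""
    rng = random.Random(seed)
    nodes = []
    for _ in range(num_nodes):
        # The square root makes the density uniform in area, not in radius.
        r = radius * math.sqrt(rng.random())
        theta = 2.0 * math.pi * rng.random()
        nodes.append((r * math.cos(theta), r * math.sin(theta)))
    return nodes


# Illustrative deployment: N = 100 nodes, R = 50 (arbitrary units).
nodes = sample_nodes(num_nodes=100, radius=50.0)
```

Drawing the radius directly as `radius * rng.random()` would instead crowd the nodes near the base-station, which changes the distance statistics that drive the energy model.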

A. Correlation model 

As the model of spatial correlation in sensor data, we consider the first model of spatial correlation 
introduced in [25], with $\alpha_1 = 1.0$, $\beta_1 = 1.0$. So, let us define $B(X_i/X_j)$, the number of bits that the node $i$ has to send when 
the node $j$ has already sent its bits to the base-station, as follows: 

BIXJXA = I "^'^ - " (25) 

where $X_i$ is the random variable representing the sampled sensor reading at node $i \in \{1, \ldots, N\}$, $n$ is the maximum number 
of bits that a node can send, and $d_{ij}$ denotes the distance between nodes $i$ and $j$. 

Let us define $B(X_i/X_1, \ldots, X_{i-1})$, the conditional information when more than one node has already sent its information 
to the base-station, as follows: 

$$B(X_i/X_1, \ldots, X_{i-1}) = \min_{1 \le j < i} B(X_i/X_j). \qquad (26)$$
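
The rule in (26) can be sketched in a few lines of Python. The pairwise function below is a toy stand-in chosen only for illustration and is not the model of [25] behind (25); the names `conditional_bits`, `toy_pairwise_bits`, the node positions, and the choice of sending the full $n$ bits when no node has been polled yet are all assumptions.

```python
import math


def conditional_bits(i, polled, pairwise_bits, n_max):
    """Bits node i sends given the nodes already polled, following (26)."""
    if not polled:
        # Assumption: with no side information the node sends all n bits.
        return n_max
    return min(pairwise_bits(i, j) for j in polled)


# Toy stand-in for (25), NOT the model of [25]: bits grow with the distance
# between the two nodes and are capped at n.
positions = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (1.0, 0.0)}
N_MAX = 8


def toy_pairwise_bits(i, j):
    (xi, yi), (xj, yj) = positions[i], positions[j]
    return min(N_MAX, math.ceil(math.hypot(xi - xj, yi - yj)))


bits_first = conditional_bits(1, [], toy_pairwise_bits, N_MAX)
bits_for_3 = conditional_bits(3, [1, 2], toy_pairwise_bits, N_MAX)
```

Because (26) takes a minimum over all previously polled nodes, adding more nodes to the polled set can only keep or reduce the number of bits the next node has to send.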

B. Comparisons with LEACH 

In this subsection, we compare the performance of the interactive communication protocol proposed in Section VIII with the 
performance of the LEACH protocol. Figures 2 and 3 show that our proposed protocol, denoted as "MCN", performs much 
better than LEACH for compression ratio $r > 0.2$. 

Here we define the network lifetime to be the number of data gathering rounds in which more than two nodes in the network 
are alive. In other words, the network is called dead when only two nodes are alive, one of which is the clusterhead and 
the other a sensor node. 

Figure 2 plots the number of sensor nodes which are still alive at the end of a certain number of data gathering rounds. In this 
plot, we compare the performance of our proposed protocol against the LEACH protocol with compression-ratios $r = 0.1, 0.2, 0.5$. 
The network starts out with $N = 100$ nodes. This plot shows that as long as $r > 0.2$, the proposed protocol performs better 
than LEACH. 

Figure 3 compares the average achievable network lifetime for our proposed protocol and LEACH for different numbers of 
nodes in the network. For LEACH, we have set $r = 0.5$. Every data point in the plot corresponds to the network lifetime 
for the given number of nodes, averaged over 1000 instances. Note that as the number of nodes in the network increases, 
the achievable lifetime increases accordingly, but saturates at some value. For our proposed protocol, the increase occurs for 
two reasons. Firstly, as the number of nodes increases in the given geographical area, the distance between the 
sensor nodes and the clusterheads they are associated with decreases. Secondly, this increased node density also increases the 
correlation in the sensor data, so every node has to send fewer bits to the clusterhead. So, as the number of nodes increases, 
each sensor node transmits fewer bits over smaller distances, on average. However, as LEACH does not exploit the correlation 
in sensor data to reduce the transmission energy budget of the sensor nodes, the increase in the network lifetime with it comes 
only from the decreasing average distance of the sensor nodes from their respective clusterheads. 

X. Conclusions and Future Work 

In this work, we have considered the "multiple correlated informants - single recipient" interactive communication problem, 
assuming that only the recipient knows the correlation structure of the informants' data. However, if we assume that the informants 
also know the correlation structure, then the optimal number of bits exchanged can be significantly reduced, resulting in more 
efficient communication protocols. Also, we have only presented the worst-case analysis in this paper. However, in some 
communication scenarios, it may be more desirable to estimate the optimal number of messages and bits exchanged on 
average. We are presently working on such extensions of our work and their application to sensor networks. 

Fig. 2. Network lifetime comparison between MCN and LEACH with various compression ratios. 